Learning properties of Noun Phrases: from data to functions

نویسندگان

  • Valeria Quochi
  • Basilio Calderone
چکیده

Abstract The paper presents two experiments of unsupervised classification of Italian noun phrases. The goal of the experiments is to identify the most prominent contextual properties that allow for a functional classification of noun phrases. For this purpose, we used a Self Organizing Map is trained with syntactically-annotated contexts containing noun phrases. The contexts are defined by means of a set of features representing morpho-syntactic properties of both nouns and their wider contexts. Two types of experiments have been run: one based on noun types and the other based on noun tokens. The results of the type simulation show that when frequency is the most prominent classification factor, the network isolates idiomatic or fixed phrases. The results of the token simulation experiment, instead, show that, of the 36 attributes represented in the original input matrix, only a few of them are prominent in the re-organization of the map. In particular, key features in the emergent macro-classification are the type of determiner and the grammatical number of the noun. An additional but not less interesting result is an organization into semantic/pragmatic micro-classes. In conclusions, our result confirm the relative prominence of determiner type and grammatical number in the task of noun (phrase) categorization.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Use of Articles in Learning English as a Foreign Language: A Study of Iranian English Undergraduates

The significance of error analysis for the learner, the teacher and the researcher is now widely recognized. Earlier studies of error analysis concentrated on intersystematic comparison of the “native language” and the “target language” and drew the required data largely from intuitions and impressionistic observations. This study was conducted on the basis of the following observations: (1) to...

متن کامل

A Machine-Learning Approach to Estimating the Referential Properties of Japanese Noun Phrases

The referential properties of noun phrases in the Japanese language, which has no articles, are useful for article generation in JapaneseEnglish machine translation and for anaphora resolution in Japanese noun phrases. They are generally classified as generic noun phrases, definite noun phrases, and indefinite noun phrases. In the previous work, referential properties were estimated by developi...

متن کامل

A Machine Learning Approach to Coreference Resolution of Noun Phrases

In this paper, we present a learning approach to coreference resolution of noun phrases in unrestricted text. The approach learns from a small, annotated corpus and the task includes resolving not just a certain type of noun phrase (e.g., pronouns) but rather general noun phrases. It also does not restrict the entity types of the noun phrases; that is, coreference is assigned whether they are o...

متن کامل

Noun Phrase Recognition by System Combination

The performance of machine learning algorithms can be improved by combining the output of different systems. In this paper we apply this idea to the recognition of noun phrases. We generate different classifiers by using different representations of the data. By combining the results with voting techniques described in (Van Halteren et al., 1998) we manage to improve the best reported performan...

متن کامل

Corpora and Cognition: The Semantic Composition of Adjectives and Nouns in the Human Brain

The action of reading, understanding and combining words to create meaningful phrases comes naturally to most people. Still, the processes that govern semantic composition in the human brain are not well understood. In this thesis, we explore semantics (word meaning) and semantic composition (combining the meaning of multiple words) using two data sources: a large text corpus, and brain recordi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008